For the past 13 seasons, the American sports entertainment reality show American Ninja Warrior has been taking the world by storm. The show brings on a variety of contestants and challenges them to complete obstacle courses across several rounds and stages. A contestant who passes a round moves on to the next one. Well, we are planning on going on American Ninja Warrior and are trying to get a leg up on the competition. Is there a way to predict the obstacles we will see on the courses? Do certain rounds favor certain types of obstacles? Well, let's find out.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
from sklearn import linear_model
from sklearn import datasets
from sklearn import metrics
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score,accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import statsmodels.api as smf
from statsmodels.formula.api import ols
data = pd.read_excel('American Ninja Warrior Obstacle History.xlsx')
data.head()
| | Season | Location | Round/Stage | Obstacle Name | Obstacle Order |
|---|---|---|---|---|---|
| 0 | 1 | Venice | Qualifying | Quintuple Steps | 1 |
| 1 | 1 | Venice | Qualifying | Rope Swing | 2 |
| 2 | 1 | Venice | Qualifying | Rolling Barrel | 3 |
| 3 | 1 | Venice | Qualifying | Jumping Spider | 4 |
| 4 | 1 | Venice | Qualifying | Pipe Slider | 5 |
Here we have a dataset that we got from awesome-public-datasets on GitHub. It lists every obstacle that appeared in the first 10 seasons of American Ninja Warrior. Each row gives the name of the obstacle, the season it appeared in, the location of the round, the round/stage, and the obstacle's position in the course order. To begin, we want to see if there is anything worth looking at. If every obstacle is only seen once, then there is no point in trying to plan for the obstacles we might see.
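# Count how often each obstacle appears; only obstacles seen more than 10 times get a label
obstacles = pd.DataFrame(columns=['obstacles','count'])
# Group by a plain string so each group key is the obstacle name itself
groups = data.groupby('Obstacle Name')
for name, group in groups:
    if len(group) > 10:
        obstacles.loc[len(obstacles.index)] = [name, len(group)]
    else:
        # Rarer obstacles keep their slice but get a blank label to reduce clutter
        obstacles.loc[len(obstacles.index)] = ["", len(group)]
# Set the figure size before plotting so it applies to this chart
plt.rcParams["figure.figsize"] = (200, 200)
plt.pie(obstacles['count'], labels=obstacles['obstacles'], textprops={'fontsize': 100})
plt.show()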
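The pie chart gives a visual answer, but the same question (do obstacles repeat at all?) can also be checked numerically. Below is a minimal sketch of one way to do that with `value_counts`. The helper `count_repeats`, its `min_count` parameter, and the tiny `sample` frame are all hypothetical and made up for illustration; they are not part of the original notebook or the real dataset.

```python
import pandas as pd

def count_repeats(df, column="Obstacle Name", min_count=2):
    """Return the obstacles that appear at least `min_count` times, most frequent first."""
    counts = df[column].value_counts()
    return counts[counts >= min_count]

# Tiny made-up sample for illustration only (NOT the real dataset)
sample = pd.DataFrame({
    "Obstacle Name": ["Quintuple Steps", "Rope Swing", "Quintuple Steps",
                      "Warped Wall", "Warped Wall", "Warped Wall"],
})

repeated = count_repeats(sample)
# Fraction of rows that belong to an obstacle seen more than once
share_repeated = repeated.sum() / len(sample)

print(repeated)
print(f"{share_repeated:.0%} of rows are obstacles seen more than once")
```

On the real data, calling `count_repeats(data, min_count=11)` should select the same obstacles that receive a label in the pie chart above, assuming the `Obstacle Name` column is spelled as shown in the table.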